@Jiajun-Ji
Contributor

Implements AMX tile operations with system permission handling and comprehensive benchmarking. Includes an optimized linalg baseline and an AOT compilation chain. Achieves ~2× speedup on Sapphire Rapids 8488C (512×2048×1024 matrices).
FileCheck tests are not included: AMX tile operations require system-level permissions and AOT compilation, and the JIT-based test flow that FileCheck tests rely on cannot initialize AMX state or make the required syscalls.
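
For readers unfamiliar with the "system permission handling" mentioned above: on Linux, a process generally has to ask the kernel for permission to use AMX tile data before executing any tile instruction. The following is a minimal sketch of that request, not the code from this PR. The function name `request_amx_tile_permission` is hypothetical, and the constants are defined locally on the assumption that they match the values in the Linux kernel's x86 xstate documentation.

```c
#if defined(__linux__) && defined(__x86_64__)
#include <sys/syscall.h>
#include <unistd.h>
#endif

/* arch_prctl() options and feature number for AMX tile data, defined
 * locally because older libc headers may not provide them. */
#define ARCH_GET_XCOMP_PERM 0x1022
#define ARCH_REQ_XCOMP_PERM 0x1023
#define XFEATURE_XTILEDATA 18

/* Hypothetical helper: returns 1 if the kernel granted this process
 * permission to use AMX tile data, 0 otherwise (including on platforms
 * where the request is not available). */
int request_amx_tile_permission(void) {
#if defined(__linux__) && defined(__x86_64__)
  unsigned long bitmask = 0;

  /* Ask the kernel to enable the dynamically sized tile-data state. */
  if (syscall(SYS_arch_prctl, ARCH_REQ_XCOMP_PERM, XFEATURE_XTILEDATA) != 0)
    return 0;

  /* Read back the permission bitmask and confirm the tile-data bit is set. */
  if (syscall(SYS_arch_prctl, ARCH_GET_XCOMP_PERM, &bitmask) != 0)
    return 0;

  return ((bitmask >> XFEATURE_XTILEDATA) & 1UL) ? 1 : 0;
#else
  return 0;
#endif
}
```

An AOT-compiled benchmark would typically call a helper like this once at startup and fall back to a non-AMX path when it returns 0. The permission handling in this PR may be structured differently.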

@zhanghb97 merged commit 7018b99 into buddy-compiler:main on Oct 10, 2025
1 check passed
Old-cpu pushed a commit to Old-cpu/buddy-mlir that referenced this pull request Oct 15, 2025
)

* [Example]: Add Intel AMX BF16 matrix multiplication.

* format the code and add the licence header.